Goto

Collaborating Authors

 adaptive online learning


Adaptive Online Learning in Dynamic Environments

Neural Information Processing Systems

In this paper, we study online convex optimization in dynamic environments, and aim to bound the dynamic regret with respect to any sequence of comparators. Existing work have shown that online gradient descent enjoys an $O(\sqrt{T}(1+P_T))$ dynamic regret, where $T$ is the number of iterations and $P_T$ is the path-length of the comparator sequence. However, this result is unsatisfactory, as there exists a large gap from the $\Omega(\sqrt{T(1+P_T)})$ lower bound established in our paper. To address this limitation, we develop a novel online method, namely adaptive learning for dynamic environment (Ader), which achieves an optimal $O(\sqrt{T(1+P_T)})$ dynamic regret. The basic idea is to maintain a set of experts, each attaining an optimal dynamic regret for a specific path-length, and combines them with an expert-tracking algorithm. Furthermore, we propose an improved Ader based on the surrogate loss, and in this way the number of gradient evaluations per round is reduced from $O(\log T)$ to $1$. Finally, we extend Ader to the setting that a sequence of dynamical models is available to characterize the comparators.


Adaptive Online Learning

Neural Information Processing Systems

We propose a general framework for studying adaptive regret bounds in the online learning setting, subsuming model selection and data-dependent bounds. Given a data-or model-dependent bound we ask, "Does there exist some algorithm achieving this bound?" We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes.Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second order data-dependent bounds, and small loss bounds. In addition we derive a new type of adaptive bound for online linear optimization based on the spectral norm, as well as a new online PAC-Bayes theorem.


Reviews: Adaptive Online Learning in Dynamic Environments

Neural Information Processing Systems

This paper studies online convex optimization in dynamic environments, where one wishes to control the regret with respect to sequences of comparators, as opposed to a static comparator. The complexity of a comparator sequence is measured by its path length P_T, and the goal is to obtain an optimal regret bound simultaneously for all path lengths. A first bound in this setting was established in the pioneering paper [1], which showed that online gradient descent (OGD, projected on the convex compact domain) with step-size 1/sqrt(T) achieves an at most T {1/2} (1 P_T) regret for all comparator sequences. However, there is a gap between this upper bound and the (T (1 P_T)) {1/2} lower bound on worst-case regret established in Theorem 2 of the present paper, which is the first lower bound for this problem. On the other hand, if the path length P_T one wishes to compare against is known in advance, optimally tuning the step-size of OGD with respect to P_T in the OGD regret bound yields optimal (T (1 P_T)) {1/2} regret for this path length.


Adaptive Online Learning

Foster, Dylan J., Rakhlin, Alexander, Sridharan, Karthik

Neural Information Processing Systems

We propose a general framework for studying adaptive regret bounds in the online learning setting, subsuming model selection and data-dependent bounds. Given a data- or model-dependent bound we ask, "Does there exist some algorithm achieving this bound?" We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes.Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second order data-dependent bounds, and small loss bounds.


Adaptive Online Learning in Dynamic Environments

Zhang, Lijun, Lu, Shiyin, Zhou, Zhi-Hua

Neural Information Processing Systems

In this paper, we study online convex optimization in dynamic environments, and aim to bound the dynamic regret with respect to any sequence of comparators. Existing work have shown that online gradient descent enjoys an $O(\sqrt{T}(1 P_T))$ dynamic regret, where $T$ is the number of iterations and $P_T$ is the path-length of the comparator sequence. However, this result is unsatisfactory, as there exists a large gap from the $\Omega(\sqrt{T(1 P_T)})$ lower bound established in our paper. To address this limitation, we develop a novel online method, namely adaptive learning for dynamic environment (Ader), which achieves an optimal $O(\sqrt{T(1 P_T)})$ dynamic regret. The basic idea is to maintain a set of experts, each attaining an optimal dynamic regret for a specific path-length, and combines them with an expert-tracking algorithm.


Improved Strongly Adaptive Online Learning using Coin Betting

Jun, Kwang-Sung, Orabona, Francesco, Willett, Rebecca, Wright, Stephen

arXiv.org Machine Learning

This paper describes a new parameter-free online learning algorithm for changing environments. In comparing against algorithms with the same time complexity as ours, we obtain a strongly adaptive regret bound that is a factor of at least $\sqrt{\log(T)}$ better, where $T$ is the time horizon. Empirical results show that our algorithm outperforms state-of-the-art methods in learning with expert advice and metric learning scenarios.


Adaptive On-line Learning in Changing Environments

Murata, Noboru, Müller, Klaus-Robert, Ziehe, Andreas, Amari, Shun-ichi

Neural Information Processing Systems

An adaptive online algorithm extending the learning of learning idea is proposed and theoretically motivated. Relying only on gradient flow information it can be applied to learning continuous functions or distributions, even when no explicit loss function is given and the Hessian is not available. Its efficiency is demonstrated for a non-stationary blind separation task of acoustic signals.


Adaptive On-line Learning in Changing Environments

Murata, Noboru, Müller, Klaus-Robert, Ziehe, Andreas, Amari, Shun-ichi

Neural Information Processing Systems

An adaptive online algorithm extending the learning of learning idea is proposed and theoretically motivated. Relying only on gradient flow information it can be applied to learning continuous functions or distributions, even when no explicit loss function is given and the Hessian is not available. Its efficiency is demonstrated for a non-stationary blind separation task of acoustic signals.


Adaptive On-line Learning in Changing Environments

Murata, Noboru, Müller, Klaus-Robert, Ziehe, Andreas, Amari, Shun-ichi

Neural Information Processing Systems

An adaptive online algorithm extending the learning of learning idea is proposed and theoretically motivated. Relying only on gradient flowinformation it can be applied to learning continuous functions or distributions, even when no explicit loss function is given andthe Hessian is not available. Its efficiency is demonstrated for a non-stationary blind separation task of acoustic signals. 1 Introduction Neural networks provide powerful tools to capture the structure in data by learning. Often the batch learning paradigm is assumed, where the learner is given all training examplessimultaneously and allowed to use them as often as desired. In large practical applications batch learning is often experienced to be rather infeasible and instead online learning is employed.